6 research outputs found

    GNUsmail: Open framework for on-line email classification

    Real-time classification of massive email data is a challenging task with its own particular difficulties. Because email data has an important temporal component, several problems arise: emails arrive continuously, and the criteria used to classify them can change, so the learning algorithms must be able to deal with concept drift. Our problem is more general than spam detection, which has received much more attention in the literature. In this paper we present GNUsmail, an open-source extensible framework for email classification whose structure supports incremental and on-line learning. This framework enables the incorporation of algorithms developed by other researchers, such as those included in WEKA and MOA. We evaluate this framework, characterized by two overlapping phases (pre-processing and learning), using the ENRON dataset, and we compare the results achieved by WEKA and MOA algorithms.

    Online evaluation of email streaming classifiers using GNUsmail

    Real-time email classification is a challenging task because of its online nature, which makes it subject to concept drift. Identifying spam, where only two labels exist, has received great attention in the literature. We are nevertheless interested in classification involving multiple folders, which is an additional source of complexity. Moreover, neither cross-validation nor other sampling procedures are suitable for evaluating data streams. Therefore, other metrics, such as the prequential error, have been proposed. However, the prequential error poses some problems, which can be alleviated by using mechanisms such as fading factors. In this paper we present GNUsmail, an open-source extensible framework for email classification, and focus on its ability to perform online evaluation. GNUsmail’s architecture supports incremental and online learning, and it can be used to compare different online mining methods using state-of-the-art evaluation metrics. We show how GNUsmail can be used to compare different algorithms, and we include a tool for launching replicable experiments.
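
The prequential error with a fading factor mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from GNUsmail. It assumes the commonly used recursive formulation, in which each new loss is added to a discounted running sum and divided by a discounted running count. The function name `faded_prequential_error` and the parameter `alpha` are hypothetical names chosen for illustration.

```python
from typing import Iterable, List


def faded_prequential_error(losses: Iterable[float], alpha: float = 0.995) -> List[float]:
    """Return the faded prequential error after each example in the stream.

    Assumed formulation (not from the source):
        S_i = loss_i + alpha * S_{i-1}
        N_i = 1      + alpha * N_{i-1}
        E_i = S_i / N_i
    With alpha = 1.0 this reduces to the plain prequential (running mean) error.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")

    faded_loss = 0.0   # discounted sum of losses, S_i
    faded_count = 0.0  # discounted number of examples, N_i
    errors: List[float] = []

    for loss in losses:
        # In a test-then-train loop, each loss comes from predicting the
        # example before the model is updated with its true label.
        faded_loss = loss + alpha * faded_loss
        faded_count = 1.0 + alpha * faded_count
        errors.append(faded_loss / faded_count)

    return errors


# 0/1 losses for a classifier that is wrong on the first two emails
# and right on the next two.
stream_losses = [1.0, 1.0, 0.0, 0.0]
plain = faded_prequential_error(stream_losses, alpha=1.0)
faded = faded_prequential_error(stream_losses, alpha=0.5)
```

In this toy stream the faded estimate drops faster than the plain running mean once the classifier starts predicting correctly, which is the behaviour that makes fading factors useful under concept drift. The value `alpha=0.5` is deliberately extreme so that the effect is visible on four examples; values close to 1 are more typical.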

    Aspectos fisiológicos de la remolacha azucarera de siembra otoñal (Physiological aspects of autumn-sown sugar beet)

    General concepts of metabolism; growth and development; enzymatic activities; adenylate levels; effect of nitrogen; nitrate reductase activity in relation to nitrogen nutrition; phosphoenolpyruvate carboxylase; inhibition of bolting; proline levels; varietal response to water stress; and identification and quantification of sugars.

    Estudios territoriales en México (Territorial studies in Mexico)
